The case for limited-preemptive scheduling in GPUs for real-time systems
Many emerging cyber-physical systems, such as autonomous vehicles, have both extreme computation and hard latency requirements. GPUs are being touted as the ideal platform for such applications due to their highly parallel organisation. Unfortunately, while offering the necessary performance, GPUs are currently designed to maximise throughput and fail to offer the necessary hard real-time (HRT) guarantees.
In this work we discuss three additions to GPUs that enable them to better meet real-time constraints. Firstly, we provide a quantitative argument for exposing the non-preemptive GPU scheduler to software. We show that current GPUs perform hardware context switches for non-preemptive scheduling in 20-26.5μs on average, while swapping out 60-270KiB of state. Although high, these overheads do not preclude non-preemptive HRT scheduling of real-time task sets. Secondly, we argue that limited-preemption support can deliver large schedulability benefits with very minor impact on the context-switching overhead. Finally, we demonstrate the need for a more predictable DRAM request arbiter to reduce interference caused by processes running in parallel on the GPU.
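The schedulability argument above can be sketched with a textbook fixed-priority response-time test: under non-preemptive execution a task can be blocked for the length of the longest lower-priority kernel, whereas limited preemption shrinks that blocking term to the longest non-preemptive region. The analysis and task parameters below are a hedged illustration of this general technique, not the paper's own model:

```python
import math

# Sketch (assumed parameters, not the paper's analysis): fixed-priority
# response-time test with a blocking term B. Non-preemptive scheduling:
# B = longest lower-priority kernel. Limited preemption: B = longest
# non-preemptive region, which can be much smaller.
def response_time(tasks, i, blocking):
    # tasks sorted by priority (index 0 = highest); each task is (C, T)
    # with worst-case execution time C and period (= deadline) T.
    C_i, T_i = tasks[i]
    R = C_i + blocking
    while True:
        # interference from all higher-priority tasks released before R
        R_next = C_i + blocking + sum(
            math.ceil(R / T_j) * C_j for C_j, T_j in tasks[:i])
        if R_next > T_i:
            return None       # deadline missed: task unschedulable
        if R_next == R:
            return R          # fixed point: worst-case response time
        R = R_next

tasks = [(1, 4), (2, 8), (4, 16)]   # hypothetical kernel set
npre = response_time(tasks, 0, blocking=4)  # non-preemptive: B = 4
lim  = response_time(tasks, 0, blocking=1)  # limited-preemptive region of 1
```

With this example set, a blocking term of 4 time units (a full lower-priority kernel) makes the highest-priority task unschedulable, while a non-preemptive region of length 1 makes it trivially schedulable, mirroring the abstract's claim that limited preemption buys schedulability cheaply.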
On the Reduction of Computational Complexity of Deep Convolutional Neural Networks.
Deep convolutional neural networks (ConvNets), which are at the heart of many emerging applications, achieve remarkable performance in audio and visual recognition tasks. Unfortunately, achieving accuracy often implies significant computational costs, limiting deployability. In modern ConvNets it is typical for the convolution layers to consume the vast majority of computational resources during inference. This has made the acceleration of these layers an important research area in academia and industry. In this paper, we examine the effects of co-optimizing the internal structures of the convolutional layers and the underlying implementation of the fundamental convolution operation. We demonstrate that a combination of these methods can have a big impact on the overall speedup of a ConvNet, achieving a ten-fold increase over the baseline. We also introduce a new class of fast one-dimensional (1D) convolutions for ConvNets using the Toom-Cook algorithm. We show that our proposed scheme is mathematically well-grounded, robust, and does not require any time-consuming retraining, while still achieving speedups, solely from the convolutional layers, with no loss in baseline accuracy.
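As a concrete illustration of the kind of fast 1D convolution the abstract refers to, here is the well-known Winograd/Toom-Cook F(2,3) minimal-filtering form, which computes two correlation outputs of a 3-tap filter with 4 elementwise multiplies instead of 6. This is a sketch of the general technique; the paper's own construction may differ:

```python
import numpy as np

# Standard F(2,3) minimal-filtering transforms (Toom-Cook/Winograd family).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def f23(d, g):
    """Two outputs of correlating a 4-sample input tile d with a 3-tap
    filter g, using 4 multiplies in the transform domain."""
    return AT @ ((G @ g) * (BT @ d))

d = np.array([1., 2., 3., 4.])
g = np.array([1., 0., -1.])
# direct correlation: [d0*g0+d1*g1+d2*g2, d1*g0+d2*g1+d3*g2] = [-2, -2]
```

The filter transform G @ g can be precomputed once per filter, so the per-tile cost is the input transform, the elementwise product, and the output transform, which is where the arithmetic savings come from.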
Configurable memory systems for embedded many-core processors
The memory system of a modern embedded processor consumes a large fraction of total system energy. We explore a range of different configuration options and show that a reconfigurable design can make better use of the resources available to it than any fixed implementation, and provide large improvements in both performance and energy consumption. Reconfigurability becomes increasingly useful as resources become more constrained, so is particularly relevant in the embedded space. For an optimised architectural configuration, we show that a configurable cache system performs an average of 20% (maximum 70%) better than the best fixed implementation when two programs are competing for the same resources, and reduces cache miss rate by an average of 70% (maximum 90%). We then present a case study of AES encryption and decryption, and find that a custom memory configuration can almost double performance, with further benefits being achieved by specialising the task of each core when parallelising the program.
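The benefit of matching cache geometry to the workload can be illustrated with a toy direct-mapped cache model; the simulator, trace, and parameters below are illustrative assumptions, not the paper's experimental setup:

```python
# Toy model (not the paper's simulator): a direct-mapped cache where the
# number of lines is the "configurable" parameter. The same trace can go
# from thrashing to fitting entirely as the geometry changes.
def miss_rate(trace, num_lines, line_bytes):
    tags = [None] * num_lines
    misses = 0
    for addr in trace:
        block = addr // line_bytes
        idx = block % num_lines          # direct-mapped index
        if tags[idx] != block:
            tags[idx] = block            # fill on miss
            misses += 1
    return misses / len(trace)

# Two passes over a 16KiB array (e.g. two programs sharing the cache):
trace = [i * 64 for i in range(256)] * 2
small = miss_rate(trace, num_lines=128, line_bytes=64)  # thrashes: 1.0
large = miss_rate(trace, num_lines=256, line_bytes=64)  # second pass hits: 0.5
```

Halving the working set's conflict pressure here halves the miss rate, a crude analogue of the reconfiguration wins the abstract reports when competing programs are given appropriately partitioned resources.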
Formalizing Reasons, Oughts, and Requirements
Reasons-based accounts of our normative conclusions face difficulties in distinguishing between what ought to be done and what is required. This article addresses this problem from a formal perspective. I introduce a rudimentary formalization of a reasons-based account and demonstrate that the model faces difficulties in accounting for the distinction between oughts and requirements. I briefly critique attempts to distinguish between oughts and requirements by appealing to a difference in strength or weight of reasons. I then present a formalized reasons-based account of permissions, oughts and requirements. The model exploits Joshua Gert's (2004; 2007) and Patricia Greenspan's (2005; 2007; 2010) suggestion that some reasons perform a purely justificatory function. I show that the model preserves the standard entailment relationships between requirements, oughts and permissions.
Augmentation Backdoors
Data augmentation is used extensively to improve model generalisation.
However, reliance on external libraries to implement augmentation methods
introduces a vulnerability into the machine learning pipeline. It is well known
that backdoors can be inserted into machine learning models through serving a
modified dataset to train on. Augmentation therefore presents a perfect
opportunity to perform this modification without requiring an initially
backdoored dataset. In this paper we present three backdoor attacks that can be
covertly inserted into data augmentation. Our attacks each insert a backdoor
using a different type of computer vision augmentation transform, covering
simple image transforms, GAN-based augmentation, and composition-based
augmentation. By inserting the backdoor using these augmentation transforms, we
make our backdoors difficult to detect, while still supporting arbitrary
backdoor functionality. We evaluate our attacks on a range of computer vision
benchmarks and demonstrate that an attacker is able to introduce backdoors
through just a malicious augmentation routine.
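The simplest of the three attack styles described above can be illustrated with a hypothetical augmentation routine that masquerades as a random flip but occasionally stamps a trigger patch and relabels the sample. All names and parameters here are illustrative assumptions, not the paper's code:

```python
import numpy as np

TARGET_CLASS = 0  # hypothetical attacker-chosen target label

def malicious_augment(image, label, rng, poison_rate=0.1):
    """Looks like a benign augmentation (random horizontal flip), but
    poisons a fraction of samples with a white 3x3 corner trigger and
    flips their label to the attacker's target class."""
    if rng.random() < 0.5:
        image = image[:, ::-1]           # the advertised augmentation
    if rng.random() < poison_rate:
        image = image.copy()
        image[-3:, -3:] = 1.0            # trigger patch in the corner
        label = TARGET_CLASS             # label flip
    return image, label

rng = np.random.default_rng(0)
img = np.zeros((8, 8))
out_img, out_lab = malicious_augment(img, 5, rng)
```

Because the poisoning lives inside the augmentation callable rather than the dataset, the training data itself remains clean on inspection, which is what makes this class of attack covert.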